Exam #1— 5 questions. 20 points each. Show your work.
1) Consider the following three samples of data (be sure to express your answers in the appropriate units):
SAMPLE A 13, 1, 10, 3, 6, 7
SAMPLE B -$2, $8, -$6, $0, -$6, -$12
SAMPLE C 30%, 5%, 20%, -20%, -7.25%
a) Calculate the sample mean for each sample.
b) Calculate the sample medians for each.
c) Calculate the sample variance and standard deviation for each sample.
d) List the samples in order of magnitude of standard deviation from lowest to highest.
2) Consider the following frequency distribution for a sample of 400 observations:
CLASS FREQUENCY
30-39 30
40-49 10
50-59 80
60-69 100
70-79 70
80-89 70
90-99 40
a) Calculate and interpret the sample group mean.
b) Calculate and interpret the sample variance.
c) Calculate and interpret the sample standard deviation.
d) Are these data symmetric?
e) If not, describe whether they are positively or negatively skewed.
f) Construct a histogram of your results in Excel or Stata showing the percent values in the bars of your histogram.
3) Using Stata and the dataset Debt.xlsx (which identifies household debt levels for a sample of 25 households in the Denver metro area) posted on Canvas,
a) Convert the Excel dataset to Stata format and save the new .dta dataset.
b) Calculate the mean, median and mode of the distribution.
c) Calculate the variance and standard deviation of the distribution.
d) Place the summary statistics mean, median, standard deviation, skewness, kurtosis, interquartile range and coefficient of variation into a single table of summary statistics.
e) Construct a relative frequency and cumulative frequency distribution table.
f) Construct a graph of your results sorting the value of debt from lowest to highest.
g) Create a title for your graph called Denver Consumer Debt and place a note in the graph reading Source: DU Survey.
4) Using Excel and the dataset agexports.xlsx (which identifies the value in millions of USD of agricultural exports by states) posted on Canvas,
a) Calculate the mean, median and mode of the distribution.
b) Calculate the variance and standard deviation of the distribution.
c) Calculate the skewness, kurtosis, interquartile range and coefficient of variation.
d) Construct a relative frequency and cumulative frequency distribution table.
e) Construct a graph of your results sorting the value of exports from lowest to highest by state.
f) Create a title for your graph called US Agricultural Exports and place a note in the graph reading Source: USDA.
5) Suppose you take a random sample of 40 textbooks at the DU Bookstore and discover that the cost of Economics textbooks at DU has a mean value of 206.25USD with a standard deviation of 20.84USD. If the distribution of textbook costs is approximately normal:
a) At least 60% of these values will be in what interval?
b) At least 68% of these values will be in what interval?
c) At least 89% of these values will be in what interval?
Exam #1— Choose 4 of 7 questions. 25 points each.
1) Consider the following four samples of data:
SAMPLE A 1,2,3,4,5,6,7,8
SAMPLE B 1,1,1,1,8,8,8,8
SAMPLE C 1,1,4,4,5,5,8,8
SAMPLE D -6,-3,0,3,6,9,12,15
a) Calculate the sample mean for each sample.
Each mean = 4.5.
b) Do you notice anything unusual about the means for these samples? If so, what do you notice?
All four means are identical (4.5), which is unusual because the samples are spread very differently around that value.
c) Calculate the sample medians for each.
d) Calculate the sample variance and standard deviation for each sample.
e) List the samples in order of magnitude of standard deviation from lowest to highest.
Sample A has the least variability (s ≈ 2.45), then Sample C (s ≈ 2.67), followed by Sample B (s ≈ 3.74) and then Sample D (s ≈ 7.35).
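The ordering above can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module (Python is an assumption here; the exam itself asks for hand calculation). The dictionary names `samples` and `summary` are illustrative only.

```python
import statistics

# The four samples exactly as given in the question.
samples = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8],
    "B": [1, 1, 1, 1, 8, 8, 8, 8],
    "C": [1, 1, 4, 4, 5, 5, 8, 8],
    "D": [-6, -3, 0, 3, 6, 9, 12, 15],
}

summary = {}
for name, data in samples.items():
    summary[name] = {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        # variance() and stdev() use the sample (n - 1) denominator.
        "variance": statistics.variance(data),
        "sd": statistics.stdev(data),
    }

# Sample names ordered from the smallest to the largest standard deviation.
ranking = sorted(summary, key=lambda name: summary[name]["sd"])

for name in ranking:
    stats = summary[name]
    print(name, stats["mean"], stats["median"],
          round(stats["variance"], 3), round(stats["sd"], 3))
```

Because `statistics.variance` divides by n - 1, the results correspond to the sample variance requested in the question rather than the population variance.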
2) Consider the following frequency distribution for a sample of 40 observations:
CLASS FREQUENCY
0-4 5
5-9 8
10-14 11
15-19 9
20-24 7
a) Calculate and interpret the sample group mean.
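Class    Midpoint (m)    Frequency (f)    f·m
0-4      2               5                10
5-9      7               8                56
10-14    12              11               132
15-19    17              9                153
20-24    22              7                154
Total                    40               505
Sample group mean = $\bar{x} = \frac{\sum f m}{n}$ = 505 / 40 = 12.625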
b) Calculate and interpret the sample variance.
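Class    Midpoint (m)    Frequency (f)    f·m    m - mean    (m - mean)^2    f·(m - mean)^2
0-4      2               5                10     -10.625     112.8906        564.4531
5-9      7               8                56     -5.625      31.64063        253.125
10-14    12              11               132    -0.625      0.390625        4.296875
15-19    17              9                153    4.375       19.14063        172.2656
20-24    22              7                154    9.375       87.89063        615.2344
Total                    40               505                                1609.375
Sample Variance = $s^2 = \frac{\sum f (m - \bar{x})^2}{n - 1}$ = 1609.375 / 39 = 41.266, where $\bar{x}$ = 505 / 40 = 12.625 is the group mean.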
c) Calculate and interpret the sample standard deviation.
Sample Std. Deviation = $s = \sqrt{s^2} = \sqrt{1609.375 / 39}$ = 6.424
d) Are these data symmetric? If not, describe whether they are positively or negatively skewed.
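The grouped calculations in parts a) to c) can be reproduced with a short script. This is a sketch in Python (an assumption; the answer key works the table by hand), using class midpoints as stand-ins for the unobserved values. The variable names are illustrative.

```python
import math

# (lower limit, upper limit, frequency) for each class in the table.
classes = [(0, 4, 5), (5, 9, 8), (10, 14, 11), (15, 19, 9), (20, 24, 7)]

midpoints = [(low + high) / 2 for low, high, _ in classes]
freqs = [freq for _, _, freq in classes]
n = sum(freqs)

# Grouped mean: frequency-weighted average of the class midpoints.
grouped_mean = sum(m * f for m, f in zip(midpoints, freqs)) / n

# Frequency-weighted sum of squared deviations from the grouped mean.
sum_sq = sum(f * (m - grouped_mean) ** 2 for m, f in zip(midpoints, freqs))

# Sample variance uses n - 1 in the denominator.
grouped_var = sum_sq / (n - 1)
grouped_sd = math.sqrt(grouped_var)

print(n, grouped_mean, sum_sq, round(grouped_var, 3), round(grouped_sd, 3))
```

The intermediate values `grouped_mean` and `sum_sq` correspond to the totals 505 / 40 and 1609.375 in the worked tables.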
3) An Auditor for a finance ministry in a developing country finds that the value of bribes paid to public officials in a given year has a mean value of 295 (measured in USD) and a standard deviation of 63USD.
a) At least 60% of these values will be in what interval?
Find a range in which it can be guaranteed that 60% of the values lie.
Use Chebyshev’s theorem: at least 60% = $1 - 1/k^2$. Solving gives $k = \sqrt{1/0.40} \approx 1.58$. The interval will range from 295 +/- (1.58)·(63) = 295 +/- 99.54, so 195.46 up to 394.54 will contain at least 60% of the observations.
b) At least 84% of these values will be in what interval?
Find a range in which it can be guaranteed that 84% of the values lie.
Use Chebyshev’s theorem: at least 84% = $1 - 1/k^2$. Solving gives $k = \sqrt{1/0.16} = 2.5$. The interval will range from 295 +/- (2.50)·(63) = 295 +/- 157.5, so 137.50 up to 452.50 will contain at least 84% of the observations.
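The two Chebyshev intervals above follow the same steps, which can be sketched as a small Python function (Python and the helper name `chebyshev_interval` are assumptions for illustration). Note that the script keeps k unrounded, so the 60% interval differs slightly from the answer above, which rounds k to 1.58 before multiplying.

```python
import math

def chebyshev_interval(mean, sd, coverage):
    """Return (k, lower, upper) such that mean +/- k*sd contains at least
    `coverage` of the values, by solving 1 - 1/k^2 = coverage for k."""
    if not 0 < coverage < 1:
        raise ValueError("coverage must be strictly between 0 and 1")
    k = math.sqrt(1 / (1 - coverage))
    return k, mean - k * sd, mean + k * sd

# Mean and standard deviation of bribe values from the question (USD).
k60, low60, high60 = chebyshev_interval(295, 63, 0.60)
k84, low84, high84 = chebyshev_interval(295, 63, 0.84)

print(round(k60, 2), round(low60, 2), round(high60, 2))
print(round(k84, 2), round(low84, 2), round(high84, 2))
```

Chebyshev’s theorem makes no assumption about the shape of the distribution, which is why it gives a guaranteed lower bound rather than an exact percentage.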
4) Suppose you are given the Excel dataset Student GPA posted on Blackboard which contains a random sample on graduating GPA and entering scores on the verbal part of the SAT for 67 graduating seniors at the University of Denver.
a) Describe the data graphically using Stata.
b) Describe the data numerically using Stata.
c) Is there a statistical relation between graduating GPA and Verbal SAT scores? How would you describe that relation?
. correlate GPA SATverb
(obs=67)

             |      GPA  SATverb
-------------+------------------
         GPA |   1.0000
     SATverb |   0.5603   1.0000
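There is a moderate positive correlation (r = 0.5603) between graduating GPA and verbal SAT score: students with higher verbal SAT scores tend to graduate with higher GPAs.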
5) A recent study on the relation between alcoholic beverage consumption and the onset of stomach ulcers is presented in the following probability table:
Alcoholic Drinks per Day    Ulcer    No Ulcer
0                           .01      .22
1                           .03      .19
2                           .03      .32
3 or more                   .04      .16
a) Are drinking alcohol and developing ulcers independent events? How do you know?
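No, because Pr(drinking alcohol) ≠ Pr(drinking alcohol | ulcer): Pr(drinking alcohol) = 1 - (.01 + .22) = .77, while Pr(drinking alcohol | ulcer) = (.03 + .03 + .04) / .11 ≈ .91.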
b) What proportion of people develops ulcers?
Pr(ulcer) = .01 + .03 + .03 + .04 = .11
c) What is the probability that a teetotaler (no alcoholic drinks) develops an ulcer?
Pr(ulcer | none) = Pr(ulcer and none) / Pr(none) = .01 / (.01 + .22) = .01 / .23 ≈ .043
d) What is the probability that someone who has an ulcer does not drink alcohol?
Pr(none | ulcer) = Pr(none and ulcer) / Pr(ulcer) = .01 / .11 ≈ .091
e) What is the probability that someone who has an ulcer drinks alcohol?
Pr(one, two, or three or more | ulcer) = (.03 + .03 + .04) / .11 = .10 / .11 ≈ .909
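The conditional probabilities in this question all come from dividing a joint probability by the appropriate marginal. A minimal Python sketch (Python is an assumption; the dictionary layout and variable names are illustrative) that reads them off the table:

```python
# Joint probabilities from the table, keyed by drinks per day.
joint = {
    "0": {"ulcer": 0.01, "no ulcer": 0.22},
    "1": {"ulcer": 0.03, "no ulcer": 0.19},
    "2": {"ulcer": 0.03, "no ulcer": 0.32},
    "3 or more": {"ulcer": 0.04, "no ulcer": 0.16},
}

# Marginal probabilities: sum a column or a row of the joint table.
p_ulcer = sum(row["ulcer"] for row in joint.values())
p_none = sum(joint["0"].values())
p_drinks = 1 - p_none

# Conditional probability = joint probability / probability of the condition.
p_ulcer_given_none = joint["0"]["ulcer"] / p_none
p_none_given_ulcer = joint["0"]["ulcer"] / p_ulcer
p_drinks_given_ulcer = 1 - p_none_given_ulcer

print(round(p_ulcer, 3), round(p_drinks, 3))
print(round(p_ulcer_given_none, 3), round(p_none_given_ulcer, 3),
      round(p_drinks_given_ulcer, 3))
```

Comparing `p_drinks` with `p_drinks_given_ulcer` is the independence check from part a): the two would be equal if drinking and ulcers were independent.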
6) A recent report suggests that the Credit scorecards used by commercial banks to grant loans contain more than one error on average. Nevertheless, banks continue to use these scorecards in their analysis of credit worthiness. Data on loan performance for a sample commercial bank is presented in the following probability table:
Loan Performance    Credit Score Under 400    Credit Score 400 or More
Repaid              .19                       .64
Defaulted           .13                       .04
a) Are credit scores and loan performance independent events? How do you know?
No, because Pr(fully repaid) ≠ Pr(fully repaid | under 400): Pr(fully repaid) = .83, while Pr(fully repaid | under 400) = .19 / (.19 + .13) ≈ .59.
b) What proportion of loans is fully repaid?
Pr(fully repaid) = .19 + .64 = .83
c) What proportion of loans given to scorers of less than 400 are fully repaid?
Pr(fully repaid | under 400) = .19 / (.19 + .13) = .19 / .32 ≈ .594
d) What proportion of loans given to scorers of 400 or more are fully repaid?
Pr(fully repaid | 400 or more) = .64 / (.64 + .04) = .64 / .68 ≈ .941
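An equivalent way to test independence in this table is to compare each joint probability with the product of its two marginals; under independence they would all match. The sketch below does this in Python (an assumption; the helper `marginal` and the key layout are illustrative only).

```python
import math

# Joint probabilities from the table: (loan performance, credit score band).
joint = {
    ("repaid", "under 400"): 0.19,
    ("repaid", "400 or more"): 0.64,
    ("defaulted", "under 400"): 0.13,
    ("defaulted", "400 or more"): 0.04,
}

def marginal(position, label):
    """Sum the joint probabilities whose key has `label` at `position`."""
    return sum(p for key, p in joint.items() if key[position] == label)

p_repaid = marginal(0, "repaid")
p_under = marginal(1, "under 400")
p_over = marginal(1, "400 or more")

# Independence requires every cell to equal the product of its marginals.
independent = all(
    math.isclose(p, marginal(0, perf) * marginal(1, band), abs_tol=1e-9)
    for (perf, band), p in joint.items()
)

p_repaid_given_under = joint[("repaid", "under 400")] / p_under
p_repaid_given_over = joint[("repaid", "400 or more")] / p_over

print(independent, round(p_repaid, 2))
print(round(p_repaid_given_under, 4), round(p_repaid_given_over, 4))
```

For example, Pr(repaid)·Pr(under 400) = (.83)(.32) = .2656, which differs from the joint probability .19, so the events are not independent.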
7) Suppose the number of pizzas delivered to DU students each month is a random variable with the following probability distribution:
x 0 1 2 3
Pr(x) .10 .30 .40 .20
a) Calculate the mean number of pizzas delivered to students per month.
$\mu_x = E(X) = \sum x \Pr(x)$ = 0(.1) + 1(.3) + 2(.4) + 3(.2) = 1.7
b) Calculate the standard deviation of pizzas delivered to students per month.
$\sigma_x^2 = Var(X) = \sum (x - \mu_x)^2 \Pr(x)$ = (0 - 1.7)^2 (.1) + (1 - 1.7)^2 (.3) + (2 - 1.7)^2 (.4) + (3 - 1.7)^2 (.2) = .81
And:
S.D.(X) = $\sigma_x = \sqrt{.81}$ = .90
c) Calculate the mean profits per student if the pizzeria makes a per unit profit of 3USD per pizza.
E(Profit) = E(3X) = 3·E(X) = 3(1.7) = 5.1
d) Calculate the standard deviation of profits per student.
Var(Profit) = Var(3X) = 3^2·Var(X) = 9(.81) = 7.29
And:
S.D.(Profit) = $\sqrt{7.29}$ = 3·S.D.(X) = 2.70
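The expected value, variance, and the scaling rules E(aX) = a·E(X) and Var(aX) = a^2·Var(X) used above can be verified with a few lines of Python (an assumption; the names `pmf` and `unit_profit` are illustrative).

```python
import math

# Probability distribution of monthly pizzas delivered per student.
pmf = {0: 0.10, 1: 0.30, 2: 0.40, 3: 0.20}

# Mean and variance of a discrete random variable.
mean_x = sum(x * p for x, p in pmf.items())
var_x = sum((x - mean_x) ** 2 * p for x, p in pmf.items())
sd_x = math.sqrt(var_x)

# Profit is a linear transformation of X: Profit = 3 * X (USD per pizza).
unit_profit = 3
mean_profit = unit_profit * mean_x
var_profit = unit_profit ** 2 * var_x
sd_profit = unit_profit * sd_x

print(round(mean_x, 2), round(var_x, 2), round(sd_x, 2))
print(round(mean_profit, 2), round(var_profit, 2), round(sd_profit, 2))
```

The standard deviation scales by the constant itself (3), while the variance scales by its square (9), which is why S.D.(Profit) is 3 times S.D.(X).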